A Neural Learning Approach for Duration Parameter Generation in Mandarin Speech Synthesis

Author

HUANG Yan

Abstract

This paper investigates a neural learning approach for generating duration parameters for Mandarin speech synthesis. Unlike traditional rule-based methods, it combines a neural learning strategy with prior linguistic knowledge to obtain the duration parameters. Rules generalized by linguists are used to encode the input vectors of the neural network, and five multi-layer neural networks are built to determine the duration parameter for each tonal syllable. Experimental results show that this is a flexible and effective way to determine duration parameters, and it may offer a useful line of thought for obtaining other prosodic parameters for speech synthesis.
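The architecture described in the abstract can be illustrated with a minimal sketch. The abstract does not give the input features, the network sizes, or what distinguishes the five networks, so everything below is an assumption made for illustration: the feature names in `FEATURES`, the hidden-layer size, and the reading that the five networks correspond to the four lexical tones plus the neutral tone. Only the forward pass is shown; training against measured syllable durations is omitted.

```python
import math
import random

# Hypothetical categorical features; the abstract does not list the actual ones.
FEATURES = {
    "prev_tone": [0, 1, 2, 3, 4, 5],             # 0 = no previous syllable
    "next_tone": [0, 1, 2, 3, 4, 5],             # 0 = no following syllable
    "position": ["initial", "medial", "final"],  # position within the phrase
}
TONES = [1, 2, 3, 4, 5]  # assumed: four lexical tones plus the neutral tone


def encode(context):
    """One-hot encode a syllable's linguistic context into a flat input vector."""
    vector = []
    for name, values in FEATURES.items():
        vector.extend(1.0 if context[name] == v else 0.0 for v in values)
    return vector


class TinyMLP:
    """One sigmoid hidden layer and a single linear output (the duration)."""

    def __init__(self, n_in, n_hidden, rng):
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        hidden = [
            1.0 / (1.0 + math.exp(-(sum(w * xi for w, xi in zip(row, x)) + b)))
            for row, b in zip(self.w1, self.b1)
        ]
        return sum(w * h for w, h in zip(self.w2, hidden)) + self.b2


INPUT_SIZE = sum(len(values) for values in FEATURES.values())
rng = random.Random(0)
# One untrained network per tone (an assumed reading of "five networks").
networks = {tone: TinyMLP(INPUT_SIZE, 8, rng) for tone in TONES}


def predict_duration(tone, context):
    """Route the syllable to the network for its tone and return its raw output."""
    return networks[tone].forward(encode(context))
```

Calling `predict_duration(4, {"prev_tone": 3, "next_tone": 0, "position": "final"})` returns the untrained output of the tone-4 network. Keeping one small network per tone, rather than feeding the tone in as another input, is one plausible way to let each tone learn its own duration behavior.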


Similar resources

Modeling Duration and Intonation in Mandarin Chinese Synthesis with a Neural Network

Prosody control plays an important role in the naturalness of synthesized speech. In previous work, great effort has gone into rule-based or parameter-based prosodic models [6]. To capture the complex interaction of the relevant prosodic factors, neural networks have recently been employed. This paper presents a new method of learning and modeling duration and intona...


A Corpus-Based Prosodic Modeling Method for Mandarin and Min-Nan Text-to-Speech Conversions

This talk gives an introduction to a recurrent neural network (RNN) based prosody synthesis method for both Mandarin and Min-Nan text-to-speech (TTS) conversions. The method uses a four-layer RNN to model the dependency of output prosodic information on input linguistic information. Main advantages of the method are the capability of learning many human prosody pronunciation rules automaticall...


An NN-based Approach to Prosodic for Synthesizing English Words Em

In this paper, a neural network-based approach to generating proper prosodic information for spelling/reading English words embedded in background Chinese texts is discussed. It expands an existing RNN-based prosodic information generator for Mandarin TTS to an RNN-MLP scheme for Mandarin-English mixed-lingual TTS. It first treats each English word as a Chinese word and uses the RNN, trained fo...


Improved generation of prosodic features in HMM-based Mandarin speech synthesis

An HMM-based text-to-speech system can produce high-quality synthetic speech with flexible modeling of spectral and prosodic parameters. However, the prosodic features, like F0 and duration trajectories, generated by HMM-based speech synthesis are often excessively smoothed and lack prosodic variance. In HMM-based TTS, durations are typically modeled statistically using state duration probabili...


A Mandarin Text-to-Speech System

In this paper, the implementation of a high-performance Mandarin TTS system is presented. The system is composed of four main parts: text analysis (TA), prosodic information generation (PIG), waveform table (WT) of 411 base-syllables, and PSOLA-based waveform synthesis (PSOLA). In TA, a statistical-model-based method is first employed to automatically tag the input text to obtain the word sequenc...




Publication date: 2005
